Does the inclusion of rare variants improve risk prediction?

Authors

  • Erin Austin
  • Wei Pan
  • Xiaotong Shen
Abstract

Every known link between a genetic variant and blood pressure improves the understanding and potentially the risk assessment of related diseases such as hypertension. Genetic data have become more comprehensive and are available for a growing number of samples. The availability of whole-genome sequencing data means that statistical genetic models must evolve to meet the challenge of using both rare variants (RVs) and common variants (CVs) to link previously unidentified genome loci to disease-related traits. Penalized regression has two features, variable selection and proportional coefficient shrinkage, that allow researchers to build models tailored to hypothesized characteristics of the genotype-phenotype map. The following work uses the Genetic Analysis Workshop 18 data to investigate the performance of a spectrum of penalized regressions, first using only CVs or only RVs to predict systolic blood pressure (SBP). Next, combinations of CVs and RVs are used to model SBP, and the impact on prediction is quantified. The study demonstrates that penalized regression improves blood pressure prediction for any combination of CVs and RVs compared with maximum likelihood estimation. More importantly, models using both types of variants provide better predictions of SBP than those using only CVs or only RVs. The predictive mean squared error was reduced by up to 11.5% when RVs were added to CV-only penalized regression models. Elastic net regression with equally weighted LASSO and ridge components, in particular, can use large numbers of single-nucleotide polymorphisms to improve prediction.
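The abstract names elastic net regression with equally weighted LASSO and ridge components but does not give its parameterization or software. The following is a minimal sketch, not the authors' implementation, assuming the common glmnet-style objective (1/2n)||y - Xb||^2 + lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||_2^2), where alpha = 0.5 is taken to mean equal weighting. The function names `soft_threshold`, `elastic_net`, and `pmse`, and the toy data, are illustrative assumptions.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator contributed by the LASSO (L1) part."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, lam, alpha=0.5, n_iter=200):
    """Coordinate descent for the assumed objective
    (1/2n)||y - Xb||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||_2^2).
    No intercept is fitted: y and the columns of X are assumed centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X*beta, with beta starting at zero
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue  # a variant with no variation carries no signal
            # Correlation of column j with the partial residual.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            # L1 part selects (can set to zero), L2 part shrinks proportionally.
            new = soft_threshold(rho, lam * alpha) / (col_ss[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


def pmse(X, y, beta):
    """Predictive mean squared error of beta on (held-out) data X, y."""
    n = len(y)
    return sum((y[i] - sum(x * b for x, b in zip(X[i], beta))) ** 2
               for i in range(n)) / n


# Toy centered design with two orthogonal predictors (hypothetical data).
X = [[1, 0], [-1, 0], [0, 1], [0, -1]]
y = [2, -2, 1, -1]

print(elastic_net(X, y, lam=0.0))  # unpenalized fit: [2.0, 1.0]
print(elastic_net(X, y, lam=1.0))  # penalized fit:   [0.5, 0.0]
```

In the toy example the penalty shrinks the first coefficient from 2.0 to 0.5 and sets the second exactly to zero, which illustrates the two features the abstract attributes to penalized regression: coefficient shrinkage and variable selection. Comparing `pmse` on held-out samples for a model fitted to CV columns only against one fitted to CV and RV columns together would mirror the kind of comparison the abstract describes.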




Volume: 8

Publication date: 2014